Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.
Objective
The objective of this data visualization is to raise awareness of the amount of inadequately managed waste in countries around the world, to visualize this data, and to investigate its correlation with each country's Gross Domestic Product (GDP). The target audience is the governments of the world, waste management officials and the general public.
The visualization chosen had the following three main issues:
Failure to answer a practical question: The visualization implies that GDP is not a good indicator of the amount of plastics being produced. This is because GDP measures a country's total economic output, not its current circumstances. It is therefore better to compare countries by their income classification group. Doing so shows that countries in lower income groups generally tend to have more mismanaged waste.
Issues with data integrity: A lot of data has been lost because the source document uses an unconventional naming system for countries (e.g. the Faeroe Islands are listed as Faroe Islands). All such observations become NA values in the visual, and the Sankey diagram represents them so poorly that they are very hard to identify.
Perceptual issue: The high number of observations creates a visual overload of information. The overlapping circles indicating GDP cannot be easily perceived by the viewer, and as the width of the observations decreases they become hard to compare, which makes it hard to generate an actionable insight. Element coding such as text (bold/regular) and circle outlines (bold/regular) bombards the viewer with visual elements to focus on.
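The income-group comparison suggested in the first issue can be sketched in a few lines of base R. The four rows below are the first four records visible in the str() output of the source data under the Code tab. The object name sample_waste and its column names are illustrative and do not appear in the original script, and four countries are far too few to support any conclusion; the sketch only shows the shape of the comparison.

```r
# First four records from the source data (see the str() output in the Code tab).
# The object and column names here are illustrative, not from the original script.
sample_waste <- data.frame(
  country = c("Albania", "Algeria", "Angola", "Anguilla"),
  economic_status = c("LMI", "UMI", "LMI", "HIC"),
  pct_inadequately_managed = c(45, 58, 71, 2)
)

# Order the income groups from lowest to highest so the summary reads naturally.
sample_waste$economic_status <- factor(
  sample_waste$economic_status,
  levels = c("LI", "LMI", "UMI", "HIC"),
  ordered = TRUE
)

# Mean share of inadequately managed waste per income group.
# Groups with no observations in this tiny sample (here "LI") are dropped.
group_means <- aggregate(
  pct_inadequately_managed ~ economic_status,
  data = sample_waste,
  FUN = mean
)
print(group_means)
```

Applied to the full data set, the same aggregation would give one summary value per income group, which is the comparison the reconstruction aims to make visible.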
Reference
The following code was used to fix the issues identified in the original visualization.
# Importing Libraries here.
library(tidyverse)
library(psych)
library(sf)
library(leaflet)
library(scales)
# Import data into R environment.
df_plastics_raw <- read_csv("./data/newplastics.csv")
continents_data <- read_csv("./data/continents-according-to-our-world-in-data.csv")
gdp_data <- read_csv("./data/2010GDP.csv")
my_shapefile <- st_read("./data/world-administrative-boundaries/world-administrative-boundaries.shp")
## Reading layer `world-administrative-boundaries' from data source
## `/Users/chris/Library/CloudStorage/OneDrive-RMITUniversity/Semester 2/Visualization/Assignment 2/data/world-administrative-boundaries/world-administrative-boundaries.shp'
## using driver `ESRI Shapefile'
## Simple feature collection with 256 features and 8 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -180 ymin: -58.49861 xmax: 180 ymax: 83.6236
## Geodetic CRS: WGS 84
# Summarise structure of data
str(df_plastics_raw)
## spc_tbl_ [202 × 14] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
## $ Country : chr [1:202] "Albania" "Algeria8" "Angola" "Anguilla" ...
## $ Economic status1 : chr [1:202] "LMI" "UMI" "LMI" "HIC" ...
## $ Coastal population2 : num [1:202] 2530533 16556580 3790041 14561 66843 ...
## $ Waste generation rate [kg/person/day]3 : num [1:202] 0.77 1.2 0.48 2.1 5.5 1.22 2.1 2.23 3.25 1.1 ...
## $ % Plastic in waste stream4 : num [1:202] 9 12 13 12 12 15 12 5 12 12 ...
## $ % Inadequately managed waste5 : num [1:202] 45 58 71 2 6 12 1 0 1 10 ...
## $ % Littered waste6 : num [1:202] 2 2 2 2 2 2 2 2 2 2 ...
## $ Waste generation [kg/day]7 : num [1:202] 1948510 19867896 1819220 30578 367637 ...
## $ Plastic waste generation [kg/day]7 : num [1:202] 174392 2374214 235589 3654 43933 ...
## $ Inadequately managed plastic waste [kg/day]7: num [1:202] 77897 1378693 166597 68 2555 ...
## $ Plastic waste littered
## [kg/day]7 : num [1:202] 3488 47484 4712 73 879 ...
## $ Mismanaged plastic waste [kg/person/day]7 : chr [1:202] "0.032" "0.086" "0.045" "0.010" ...
## $ Mismanaged plastic waste in 2010
## [tonnes]7: num [1:202] 29705 520555 62528 52 1253 ...
## $ Mismanaged plastic waste in 2025
## [tonnes]7: num [1:202] 63051 1017444 136770 73 1385 ...
## - attr(*, "spec")=
## .. cols(
## .. Country = col_character(),
## .. `Economic status1` = col_character(),
## .. `Coastal population2` = col_number(),
## .. `Waste generation rate [kg/person/day]3` = col_double(),
## .. `% Plastic in waste stream4` = col_double(),
## .. `% Inadequately managed waste5` = col_double(),
## .. `% Littered waste6` = col_double(),
## .. `Waste generation [kg/day]7` = col_number(),
## .. `Plastic waste generation [kg/day]7` = col_number(),
## .. `Inadequately managed plastic waste [kg/day]7` = col_number(),
## .. `Plastic waste littered
## .. [kg/day]7` = col_number(),
## .. `Mismanaged plastic waste [kg/person/day]7` = col_character(),
## .. `Mismanaged plastic waste in 2010
## .. [tonnes]7` = col_number(),
## .. `Mismanaged plastic waste in 2025
## .. [tonnes]7` = col_number()
## .. )
## - attr(*, "problems")=<externalptr>
# NA values are Notes written by Author. No other NA Values.
which(!complete.cases(df_plastics_raw))
## [1] 193 194 195 196 197 198 199 200 201 202
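# Remove the 10 rows at the bottom of the data set; they contain NA values because they are notes written by the author.
# drop_na() (tidyr) drops every row that contains an NA, without the warning that remove_missing() emits.
df_plastics <- df_plastics_raw %>% drop_na()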
# Edits in Plastics dataframe: strip the trailing "8" left on some country names
# (e.g. "Algeria8") and convert the per-person column from character to numeric.
df_plastics <- df_plastics %>%
  mutate(
    Country = gsub("8$", "", Country),
    `Mismanaged plastic waste [kg/person/day]7` = as.numeric(`Mismanaged plastic waste [kg/person/day]7`)
  )
# Order the income groups from lowest to highest. Note that the case_when()
# recode below returns a character column, so this ordering is not retained.
df_plastics$`Economic status1` <- factor(
  df_plastics$`Economic status1`,
  levels = c("LI", "LMI", "UMI", "HIC"),
  ordered = TRUE
)
df_plastics <- df_plastics %>%
  mutate(`Economic status1` = case_when(
    `Economic status1` == "LI" ~ "Lower Income",
    `Economic status1` == "LMI" ~ "Lower Middle Income",
    `Economic status1` == "UMI" ~ "Upper Middle Income",
    `Economic status1` == "HIC" ~ "High Income",
    TRUE ~ NA_character_
  ))
# Edits in Continent dataframe.
names(continents_data)[1] <- "Country"
# Data Preprocessing.
df_plastics$Country <- gsub("&", "and", df_plastics$Country)
df_plastics$Country <- gsub("[[:punct:]]", "", df_plastics$Country)
df_plastics$Country <- trimws(df_plastics$Country)
df_plastics$Country <- gsub("BurmaMyanmar", "Myanmar", df_plastics$Country)
continents_data$Country <- gsub("&", "and", continents_data$Country)
continents_data$Country <- gsub("[[:punct:]]", "", continents_data$Country)
continents_data$Country <- trimws(continents_data$Country)
df_plastics <- df_plastics %>% filter(Country != "Dhekelia")
# Map country names used in the plastics data to the names used in the continents data.
old_names <- c(
  "Congo Dem rep of", "Congo Rep of", "East Timor",
  "Faroe Islands", "Korea North", "Korea South Republic of Korea",
  "Micronesia", "Palestine Gaza Strip is only part on the coast",
  "Saint Maarten DWI", "Saint Pierre", "Svalbard", "The Gambia",
  "USVI"
)
new_names <- c(
  "Democratic Republic of Congo", "Congo", "Timor", "Faeroe Islands",
  "North Korea", "South Korea", "Micronesia country", "Palestine",
  "Saint Martin French part", "Saint Pierre and Miquelon",
  "Svalbard and Jan Mayen", "Gambia", "United States Virgin Islands"
)
df_plastics <- df_plastics %>%
  mutate(Country = if_else(Country %in% old_names, new_names[match(Country, old_names)], Country))
# Left join on the shared Country column so every plastics record is kept.
df_plastics <- merge(df_plastics, continents_data, by = "Country", all.x = TRUE)
df_plastics <- df_plastics[, -which(names(df_plastics) == "Year")]
df_plastics$Continent <- df_plastics$Continent %>% as.factor()
# Edits in GDP data frame
gdp_data <- gdp_data %>% filter(Year == 2010)
# Final Data frame structure
str(df_plastics)
## 'data.frame': 191 obs. of 16 variables:
## $ Country : chr "Albania" "Algeria" "Angola" "Anguilla" ...
## $ Economic status1 : chr "Lower Middle Income" "Upper Middle Income" "Lower Middle Income" "High Income" ...
## $ Coastal population2 : num 2530533 16556580 3790041 14561 66843 ...
## $ Waste generation rate [kg/person/day]3 : num 0.77 1.2 0.48 2.1 5.5 1.22 2.1 2.23 3.25 1.1 ...
## $ % Plastic in waste stream4 : num 9 12 13 12 12 15 12 5 12 12 ...
## $ % Inadequately managed waste5 : num 45 58 71 2 6 12 1 0 1 10 ...
## $ % Littered waste6 : num 2 2 2 2 2 2 2 2 2 2 ...
## $ Waste generation [kg/day]7 : num 1948510 19867896 1819220 30578 367637 ...
## $ Plastic waste generation [kg/day]7 : num 174392 2374214 235589 3654 43933 ...
## $ Inadequately managed plastic waste [kg/day]7: num 77897 1378693 166597 68 2555 ...
## $ Plastic waste littered
## [kg/day]7 : num 3488 47484 4712 73 879 ...
## $ Mismanaged plastic waste [kg/person/day]7 : num 0.032 0.086 0.045 0.01 0.051 0.026 0.007 0.002 0.011 0.016 ...
## $ Mismanaged plastic waste in 2010
## [tonnes]7: num 29705 520555 62528 52 1253 ...
## $ Mismanaged plastic waste in 2025
## [tonnes]7: num 63051 1017444 136770 73 1385 ...
## $ Code : chr "ALB" "DZA" "AGO" "AIA" ...
## $ Continent : Factor w/ 6 levels "Africa","Asia",..: 3 1 1 4 4 6 4 5 4 2 ...
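# Prepare world shape data: drop the Azores Islands and Gaza Strip features
# and compute a centroid for each remaining feature.
my_shapefile <- my_shapefile %>%
  filter(!name %in% c("Azores Islands", "Gaza Strip")) %>%
  mutate(center = st_centroid(geometry))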
# Merge plastics data frame with world shape data.
df_plastics <- merge(df_plastics, my_shapefile, by.x = "Code", by.y = "iso3", all.x = TRUE)
# Merge plastics data frame with GDP data.
df_plastics <- merge(df_plastics, gdp_data, by.x = "Code", by.y = "Code", all.x = TRUE)
# Convert the merged data frame to an sf object for mapping.
sf_plastics <- df_plastics %>% st_as_sf()
pal <- colorNumeric(palette = "YlOrBr", domain = sf_plastics$`% Inadequately managed waste5`)
pal2 <- colorNumeric(palette = "YlOrRd", domain = log(sf_plastics$`Plastic waste generation [kg/day]7`))
# Prepare the text for tool tips:
mytext <- paste(
  "Country: ", sf_plastics$Country, "<br/>",
  "Waste Generated: ", comma(sf_plastics$`Plastic waste generation [kg/day]7`, big.mark = ","), " kg/day<br/>",
  "Mismanaged Waste: ", comma(sf_plastics$`Inadequately managed plastic waste [kg/day]7`, big.mark = ","), " kg/day<br/>",
  "Mismanaged waste: ", sf_plastics$`% Inadequately managed waste5`, "%<br/>",
  "Group: ", sf_plastics$`Economic status1`, "<br/>",
  "GDP: ", comma(sf_plastics$`GDP (constant 2015 US$)`, big.mark = ","), "<br/>",
  sep = ""
) %>%
  lapply(htmltools::HTML)
# Final Plot
plastics_plot <- leaflet() %>%
  addProviderTiles(providers$CartoDB.Positron) %>%
  addPolygons(
    data = sf_plastics,
    fillColor = ~pal(`% Inadequately managed waste5`),
    stroke = TRUE,
    fillOpacity = 1,
    color = "white",
    weight = 0.3,
    label = mytext,
    labelOptions = labelOptions(
      style = list("font-weight" = "normal", padding = "3px 8px"),
      textsize = "13px",
      direction = "auto"
    ),
    highlightOptions = highlightOptions(
      weight = 1,
      color = "black",
      fillOpacity = 0.5,
      bringToFront = TRUE
    )
  ) %>%
  addLegend(
    "bottomleft",
    pal = pal,
    values = ~`% Inadequately managed waste5`,
    title = "Mismanaged waste",
    labFormat = labelFormat(suffix = "%"),
    opacity = 1,
    data = sf_plastics
  )
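Because the data integrity issue came from country names that failed to match across sources, it can help to list the unmatched names before and after the renaming step. The sketch below uses only base R. The vectors plastics_names and reference_names are small illustrative subsets built from names that appear in the renaming step, not the full data sets, and name_map is a hypothetical lookup that is not part of the original script.

```r
# Illustrative subsets of country names; the real script would use
# df_plastics$Country and continents_data$Country instead.
plastics_names <- c("Albania", "Faroe Islands", "The Gambia", "East Timor")
reference_names <- c("Albania", "Faeroe Islands", "Gambia", "Timor")

# Names present in the plastics data but absent from the reference data.
# Each of these would end up with NA values after a left join.
unmatched_before <- setdiff(plastics_names, reference_names)

# Named lookup vector: names are the old spellings, values the new ones.
name_map <- c(
  "Faroe Islands" = "Faeroe Islands",
  "The Gambia" = "Gambia",
  "East Timor" = "Timor"
)

# Replace only the names that have an entry in the lookup.
needs_fix <- plastics_names %in% names(name_map)
harmonised <- plastics_names
harmonised[needs_fix] <- unname(name_map[plastics_names[needs_fix]])

# After harmonising, nothing should be left unmatched.
unmatched_after <- setdiff(harmonised, reference_names)
print(unmatched_before)
print(unmatched_after)
```

Running a check like this on the full name columns before each merge would surface any remaining mismatches explicitly instead of leaving them as silent NA values in the map.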
Data References
Jambeck, J. R., Geyer, R., Wilcox, C., Siegler, T. R., Perryman, M., Andrady, A., Narayan, R., & Law, K. L. (2015). Plastic waste inputs from land into the ocean. Science, 347(6223), 768–771. https://doi.org/10.1126/science.1260352
Gross domestic product (GDP). Our World in Data. (n.d.). Retrieved May 1, 2023, from https://ourworldindata.org/grapher/gross-domestic-product?time=2011
Continents according to our world in Data. Our World in Data. (n.d.). Retrieved May 1, 2023, from https://ourworldindata.org/grapher/continents-according-to-our-world-in-data
The following plot fixes the main issues in the original.